Measuring Similarity between Graphs Based on the Levenshtein Distance
Abstract
Graph data is widely used and studied in both academia and industry across many applications. Measuring the similarity between graphs (i.e., graph matching) is an essential step in graph search, pattern recognition, and machine vision. At present, the most widely used approach to the graph matching problem is graph edit distance (GED). However, GED is computationally expensive, and its running time becomes unacceptable as graphs grow larger. In general, a graph can be canonically labeled by a string, and we use the depth-first search (DFS) code as our canonical labeling system. Combining DFS codes with the Levenshtein distance (i.e., string edit distance, SED), we propose a novel method for measuring graph similarity. By processing two DFS codes and computing the distance between them, we turn the graph matching problem into a string matching problem, which greatly improves matching performance. The experimental results demonstrate the usefulness of the method.
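To make the idea in the abstract concrete, the following is a minimal sketch of the Levenshtein distance applied to two sequences of edge tuples. The abstract does not say how the DFS codes are processed before comparison, so the tuple layout `(from, to, from_label, edge_label, to_label)`, the two example codes, and the function name `levenshtein` are assumptions made for illustration only, not the authors' implementation.

```python
from typing import Hashable, Sequence


def levenshtein(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    # prev[j] is the distance between the items of a seen so far and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning i items of a into an empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x with y, or match
        prev = curr
    return prev[-1]


# Hypothetical DFS codes (assumed layout): one tuple per edge, in DFS order.
triangle = [(0, 1, "A", "x", "B"), (1, 2, "B", "x", "C"), (2, 0, "C", "x", "A")]
path = [(0, 1, "A", "x", "B"), (1, 2, "B", "x", "C")]

print(levenshtein(triangle, path))        # edge-level distance between the two codes
print(levenshtein("kitten", "sitting"))   # the same function on plain strings
```

Treating each edge tuple as a single symbol is one possible design choice; a finer-grained variant could compare the fields inside each tuple, but nothing in the abstract indicates which granularity the authors use.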
Similar resources
Measuring of Strategies' Similarity in Automated Negotiation
Negotiation is a process between self-interested agents in e-commerce trying to reach an agreement on one or more issues. The outcome of the negotiation depends on several parameters, such as the agents' strategies and the knowledge one agent has about its opponents. One way to discover an opponent's strategy is to find the similarity between strategies. In this paper we present a simple model...
A Knowledge-Rich Approach to Measuring the Similarity between Bulgarian and Russian Words
We propose a novel knowledge-rich approach to measuring the similarity between a pair of words. The algorithm is tailored to Bulgarian and Russian and takes into account the orthographic and the phonetic correspondences between the two Slavic languages: it combines lemmatization, hand-crafted transformation rules, and weighted Levenshtein distance. The experimental results show an 11-pt interpo...
Random Projection and Geometrization of String Distance Metrics
Edit distance is not the only way to calculate the distance between two character sequences. Strings can also be compared in somewhat subtler geometric ways. A procedure inspired by Random Indexing can attribute a D-dimensional geometric coordinate to any character N-gram present in the corpus and can subsequently represent the word as a sum of N-gram fragments which the string conta...
Measuring the Similarity of Trajectories Using Fuzzy Theory
In recent years, advances in positioning systems have provided access to large amounts of movement data. One method of discovering knowledge from this type of data is to measure the similarity of the trajectories produced by moving objects. Similarity measurement has also been used in other data mining methods such as classification and clustering and is currently, an...
Measuring Musical Rhythm Similarity: Edit Distance versus Minimum-Weight Many-to-Many Matchings
Musical rhythms are represented as binary symbol sequences of sounded and silent pulses of unit-duration. A measure of distance (dissimilarity) between a pair of rhythms commonly used in music information retrieval, music perception, and musicology is the edit (Levenshtein) distance, defined as the minimum number of symbol insertions, deletions, and substitutions needed to transform one rhythm ...